.require "memo.pub[e80,jmc]" source
.cb LISP - NOTES ON ITS PAST AND FUTURE
Abstract: LISP has survived for 21 years because it is
an approximate local optimum in the space of programming
languages. However, it has accumulated some barnacles
that should be scraped off, and some long-standing
opportunities for improvement have been neglected.
It would benefit from some co-operative maintenance,
especially in creating and maintaining program libraries.
Computer checked proofs of program correctness are now
possible for pure LISP and some extensions, but more theory
and some smoothing of the language itself are required before
we can take full advantage of LISP's mathematical basis.
.<< Introduction>>
.skip 2
.bb Introduction
On LISP's approximate 21st anniversary, no doubt something could
be said about coming of age, but it seems doubtful that the normal life
expectancy of a programming language is three score and ten. In fact,
LISP seems to be the second oldest surviving programming language after
Fortran, so maybe we should plan on holding one of these newspaper
interviews in which grandpa is asked to what he attributes having lived to
100. Anyway, the early history of LISP was already covered in
(McCarthy 1977), which was given at the ACM conference on
the history of programming languages.
Therefore, these notes first review some of the salient features
of LISP and their relation to its long survival, noting that it has never
been supported by a computer company. LISP has a partially justified
reputation of being based more on theory than most computer languages,
presumably stemming from its functional form, its use of lambda notation
and basing the interpreter on a universal function.
From the beginning, I have wanted to
develop techniques for making computer checkable proofs of LISP programs,
and now this is possible for a large part of LISP. Still other present
and proposed facilities are in a theoretically more mysterious state. I
will conclude with some remarks on improvements that might be made in LISP
and the prospects for replacing it by something substantially better.
.<<survival>>
.skip 2
.bb The Survival of LISP
As a programming language, LISP is characterized by the following
ideas:
#. Computing with symbolic expressions rather than numbers.
#. Representation of symbolic expressions and other information by
list structure in computer memory.
#. Representation of information on paper, from keyboards,
and in other external media mostly by
multi-level lists and sometimes by S-expressions. It has been
important that any kind of data can be represented by a single general
type.
#. A small set of selector and constructor
operations expressed as functions, i.e. ⊗car, ⊗cdr and ⊗cons.
#. Composition of functions as a tool for
forming more complex functions.
#. The use of conditional expressions for
getting branching into function definitions.
#. The recursive use of
conditional expressions as a sufficient tool for building computable
functions.
#. The use of λ-expressions for naming functions.
#. The storage of information on the property lists of atoms.
#. The representation of LISP programs as LISP data that can
be manipulated by object programs. This has prevented the separation
between system programmers and application programmers. Everyone can
"improve" his LISP, and many of these "improvements" have developed
into improvements to the language.
#. The conditional expression interpretation of Boolean connectives.
#. The LISP function %2eval%1 that
serves both as a formal definition of the language and as an interpreter.
#. Garbage collection as the means of erasure.
#. Minimal requirements for declarations so that LISP statements
can be executed in an on-line environment without preliminaries.
.tall; next page
#. LISP statements as a command language in an on-line environment.
Of course, the above doesn't mention the features of its
"program feature" that LISP has in common with most programming languages.
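Most of these points can be seen in even the smallest programs. As a
reminder, here is the classic substitution function in present-day
(Maclisp-style) notation - a sketch, since the original was written
as an M-expression rather than with DEFUN:
.begin nofill
.skip 1
;; subst[x;y;z] - the result of substituting the S-expression x
;; for the atom y in the S-expression z.  Conditional expressions,
;; recursion and the functions car, cdr, cons, atom and eq suffice.
(DEFUN SUBST (X Y Z)
  (COND ((ATOM Z) (COND ((EQ Z Y) X) (T Z)))
        (T (CONS (SUBST X Y (CAR Z))
                 (SUBST X Y (CDR Z))))))

;; The definition is itself a list, so object programs can take it
;; apart with the same car, cdr and cons.
.end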
All these features have remained viable and the
combination must be some kind of approximate
local optimum in the space of programming languages, because LISP has
survived several attempts to replace it, some rather determined. It
may be worthwhile to review a few of these and guess why they didn't
make it.
1. SLIP included list processing in Fortran. It used
bidirectional lists and didn't allow recursive functions or
conditional expressions. The bidirectional lists offered
advantages in only a few applications but otherwise took up
space and time. It didn't encourage on-line use, since Fortran
doesn't.
2. Formac was another Fortran-based language that was pushed
for a while by part of IBM. It was dedicated to manipulating
a class of algebraic formulas written in Fortran style and was also
oriented to batch processing.
3. Formula Algol was dedicated to the linguistic pun
that the elementary operations can be regarded as operating
on numbers or on formulas. The idea was that if a variable
⊗x has no value, then operations on expressions involving
⊗x must be regarded as operating on the formula. A few programs
could be written, but the pun proved an inadequate basis for
substantial programs.
4. One of the more interesting rivals to LISP is (or was)
POP-2. It has everything that LISP has except that its
statements are written in an Algol-like form and have no
internal form as list structure. Thus POP-2 programs can
produce other POP-2 programs only as character strings.
This makes a much sharper distinction between system programmers
and application programmers than in LISP. In LISP, for example,
anyone can make his own fancy macro recognizer and expander.
5. Microplanner is an attempt to make a higher level
general purpose language than LISP. The higher level involves both
data (pattern matching) and control (goal seeking and failure
mechanisms). Unfortunately, both proved inadequately general,
and programmers were driven to very elaborate constructions, then to
new languages like CONNIVER with even more elaborate control
features; eventually many went back to LISP.
One generic trouble seems to be that no one adequately understands
pattern-directed computation, which always works very nicely on
simple examples but leads to over-complicated systems
when generalized. We can see this in LISP in certain macro expansion
systems like that of the LISP machine (Weinreb and Moon 1978).
6. I should mention Prolog, but I don't understand it
well enough to comment.
.<< Improvements>>
.bb Improvements
Like almost everything, LISP is subject to improvement. The
various versions of LISP have accumulated many barnacles with time,
and these would have to be scraped off
before a definitive standardizable language could be achieved - a
worthwhile but long-term goal.
Meanwhile here are a few directions for improvement. Some are
purely operational and others have more conceptual content.
.item←0
#. Incorporating more standard functions into the language and
rationalizing the standard functions in the present versions.
Designers of programming languages often propose omitting
from the definition of the language facilities that can be defined
within the language on the grounds that the user can do it for
himself. The result is often that users cannot use each other's
programs, because each installation and user performs various
common tasks in different ways. Insofar as programmers use
local libraries without rewriting the functions, they are using
different languages if they use different local libraries. Compatibility
between users of LISP would be much enhanced if there were more
standard functions.
#. Syntax directed input and output.
A notation for representing symbolic information can be optimized
from three points of view: One can try to make it easy to write.
One can try to make it easy and pleasant to read. One can make it
easy to manipulate with computer programs. Unfortunately, these
desiderata are almost always grossly incompatible. LISP owes most
of its success to optimizing the third. LISP lists and S-expressions
in which the ⊗car of an item identifies its kind have proved most
suitable as data for programming.
When the amount of input and output is small, users are inclined to
accept the inconvenience of entering the input and seeing the output
as lists or S-expressions. Otherwise they write ⊗read and ⊗print programs
of varying elaborateness. Input and output programs are often a large part
of the work and a major source of bugs. Moreover, input programs often
must detect and report errors in the syntax of input.
LISP would be much improved by standard facilities for syntax
directed input and output. Some years ago Lynn Quam implemented a
system that used the same syntax description for both input and output,
but this was rather constraining. Probably one wants different
syntaxes for input and output, and input syntaxes should specify
ways of complaining about errors. The idea is to provide standard
facilities for a programmer to describe correspondences
between data in an external medium and S-expressions, e.g. he should
be able to say something like
(PLUS ⊗x ... ⊗z) → ⊗x + ... + ⊗z,
(DIFFERENCE ⊗x ⊗y) → ⊗x - ⊗y,
although I hold no brief for this particular notation; a small sketch
of one direction of such a correspondence appears at the end of this list.
#. Syntax directed computation in general.
It isn't clear whether this would be a feature to be added to
LISP or a new language. However, it seems likely that both the functional
form of computation that LISP has now and syntax directed features are
wanted in one language.
#. LISP might benefit if we could find a way to finance and
manage a central agency that could keep libraries, make agreed on
machine independent improvements, maintain a standard subset, and
co-ordinate pressure on computer manufacturers to develop and maintain
adequate LISPs on their machines. It shouldn't get too powerful.
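Returning to the second item, here is a minimal sketch of one direction
of such a correspondence: rendering PLUS and DIFFERENCE expressions in
infix form. The functions ⊗infix and ⊗interleave are made-up names for
illustration, and a real facility would interpret syntax descriptions
rather than be coded by hand:
.begin nofill
.skip 1
;; infix[e] - e with PLUS and DIFFERENCE written in infix form.
;; Nested expressions remain sublists, so printing them supplies
;; the parentheses that infix notation requires.
(DEFUN INFIX (E)
  (COND ((ATOM E) E)
        ((EQ (CAR E) 'PLUS)
         (INTERLEAVE (MAPCAR 'INFIX (CDR E)) '+))
        ((EQ (CAR E) 'DIFFERENCE)
         (INTERLEAVE (MAPCAR 'INFIX (CDR E)) '-))
        (T E)))

;; interleave[l;op] - the elements of l separated by op.
(DEFUN INTERLEAVE (L OP)
  (COND ((NULL (CDR L)) L)
        (T (CONS (CAR L) (CONS OP (INTERLEAVE (CDR L) OP))))))

;; (INFIX '(PLUS X (DIFFERENCE Y Z) W)) = (X + (Y - Z) + W)
.end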
.<< Proving Correctness of LISP Programs>>
.skip 2
.bb Taking Advantage of LISP's Theoretical Foundation
As soon as pure LISP took its present form, it became apparent
that properties of LISP functions should be provable by algebraic
manipulation together with an appropriate form of mathematical
induction. This gave rise to the goal of creating a mathematical
theory of computation that would lead to computer checked proofs
(McCarthy 1962)
that programs meet their specifications.
Because LISP functions are %2prima facie%1 partial
functions, standard logical techniques weren't immediately
applicable, although recursion induction (McCarthy 1963)
quickly became available as an informal method.
The methods of Kleene (1952) might have been adapted to proving
properties of programs had anyone who understood them well been
properly motivated and seen the connections.
The first adequate formal method was based on Cartwright's
(1977) thesis, which permits a LISP function definition such as
.begin nofill
.skip 1
.at "q[" ⊂"%4 %1[%2"⊃
.turn on "∂"
%2appendq[u,v%1] ← ∂(16)%2qif null u qthen v
∂(16)qelse consq[car u, appendq[cdr u, v%1]]
to be replaced by a first order sentence
∀%2u v.q[appendq[u,v%1] = ∂(21)%2qif null u qthen v
∂(21)qelse consq[car u, appendq[cdr u, v%1]]]
.end
without first having to prove that the program terminates for any
lists ⊗u and ⊗v. The proof of termination has exactly the same
form as any other inductive proof. See also (Cartwright and McCarthy 1979).
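For example, writing ⊗append plainly, its associativity follows from
such a sentence by list induction on ⊗u. The following sketch is in the
informal style and is not taken verbatim from the cited papers:
.begin nofill
.skip 1
     append[append[u,v],w] = append[u,append[v,w]]

Case null u:  both sides equal append[v,w].
Case u = cons[x,u1]:
     append[append[u,v],w]
   = append[cons[x,append[u1,v]],w]     by the sentence
   = cons[x,append[append[u1,v],w]]     by the sentence
   = cons[x,append[u1,append[v,w]]]     induction hypothesis
   = append[u,append[v,w]]              by the sentence.
.end
Termination of ⊗append is proved by the same induction, taking the
property of ⊗u to be that append[u,v] is defined and is a list.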
The Elephant formalism (McCarthy 1981 forthcoming) supplies a second
method appropriate for sequential LISP programs. Boyer and Moore
(1979) provide proof finding as well as proof checking in a different
formalism that requires a proof that a function is total as part of
the process of accepting its definition.
I should say that I don't regard the LCF methods as adequate,
because the "logic of computable functions" is too weak to fully specify
programs.
These methods (used informally) have been successfully taught as
part of the LISP course at Stanford and will be described in the textbook
(McCarthy and Talcott 1980). It is also quite feasible to
check the proofs by machine using Richard Weyhrauch's FOL interactive
proof-checker for first order logic, but practical use requires a LISP
system that integrates the proof checker with the interpreter and
compiler.
The ultimate goal of computer proof-checking is a system
that will be used by people without mathematical inclination
simply because it leads more quickly to programs without bugs.
This requires further advances that will make possible shorter
proofs and also progress in writing the specifications of programs.
Probably some parts of the specifications, such as the requirement
that the program terminate, are almost syntactic in their checkability.
However, the specifications of programs used in AI work require new
ideas even to formulate. I think that recent work in non-monotonic
reasoning will be relevant here, because the fact that an AI program
works requires jumping to conclusions about the world in which it
operates.
While pure LISP and the simple form of the "program feature"
are readily formalized, many of the fancier features of the operational
LISP systems such as Interlisp, Maclisp and Lisp Machine LISP are harder
to formalize. Some of them like FEXPRs require more mathematical
research, but others seem to me to be kludges and
should be made more mathematically neat both so that properties of
programs that use them can be readily proved and also to reduce ordinary
bugs.
The following features of present LISP systems and proposed
extensions require new methods for correctness proofs:
.item←0
#. Programs that involve re-entrant list structure. Those that
don't involve ⊗rplaca and ⊗rplacd, such as search and print programs, are
more accessible than those that do; a small example appears after this
list. I have an induction method on finite
graphs that applies to them, but I don't yet know how to treat ⊗rplaca,
etc. Induction on finite graphs also has applications to proving theorems
about flowchart programs.
#. No systematic methods are known for formally stating and
proving properties of syntax directed computations.
#. Programs that use macro expansions are in principle doable
via axiomatizations of the interpreter, but I don't know of any actual
formal proofs.
#. No techniques exist for correctness proofs of programs involving
lazy evaluators.
#. Programs with functional arguments are in principle accessible
by Dana Scott's methods, but the different kinds of functional arguments
have been treated only descriptively and informally.
#. Probably the greatest obstacle to making proof-checking a useful
tool is our lack of knowledge of how to express the specifications of
programs. Many programs have useful partial specifications - they shouldn't
loop or modify storage that doesn't belong to them. A few satisfy algebraic
relations, and this includes compilers. However, programs that interact
with the world have specifications that involve assumptions about the
world. AI programs in general are difficult to specify; most likely
their very specification involves default and other non-monotonic reasoning.
(See McCarthy 1980).
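As an illustration of the first difficulty, here is how ⊗rplacd
produces a re-entrant structure (a Maclisp-style sketch). On such a
structure induction on list structure is unsound, and a naive ⊗print
or ⊗equal does not terminate:
.begin nofill
.skip 1
(SETQ L (LIST 'A 'B 'C))
(RPLACD (CDDR L) L)     ; the last cdr now points back to the head
;; L is now a finite graph rather than a tree:
;; (CAR (CDDDR L)) = A, and (PRINT L) would never finish.
.end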
.<<remarks>>
.skip 2
.bb Mysteries and other Matters
.item←0
#. Daniel Friedman and David Wise have argued that ⊗cons should
not evaluate its arguments and have shown that this allows certain
infinite list structures to be regarded as objects. Trouble is
avoided, because only as much of the infinite structure is created as
is necessary to get the answers to be printed. Exactly what domain
of infinite list structures is assumed is unclear to me.
While they give interesting examples of applications, it isn't clear
whether the proposed extension has practical value; a small sketch of
the idea appears at the end of this section.
#. Many people have proposed implementations of full lambda calculus.
This permits higher level functions, i.e. functions
of functions of functions etc., but allows only manipulations based
on composition and lambda conversions, not general manipulations
of the symbolic form of functions. While
conditional expressions are not directly provided, they can be imitated
by writing (as proposed by Dana Scott in an unpublished note)
%3true%1 as λ%2x_y.x, %3false%1 as λ⊗x_y.y and
%2qif_p_qthen_a_qelse_b%1 as ⊗p(a)(b). Another neat idea of Scott's
(adapted from one of Church) is to identify the natural number ⊗n with
the operation of taking the %2(n+1)%1th element of a list.
The mystery is whether the extension to lambda calculus
has any practical significance, and the current best guess is no, although
Scott's notational idea suggests changing the notation of LISP and
writing 0 for ⊗car, 1 for ⊗cadr, 2 for ⊗caddr, etc.
#. Pure LISP would be much simpler conceptually if
all list structure were represented uniquely in memory. This can
be done using a hash ⊗cons, but then ⊗rplaca and friends don't
work. Can't we somehow have the best of both worlds?
#. It seems to me that LISP will probably be superseded for
many purposes by a language that does to LISP what LISP does to
machine language. Namely it will be a higher level language than
LISP that, like LISP and machine language, can refer to its own
programs. (However, a higher level language than LISP might
have such a large declarative component that its texts may not
correspond to programs. If what replaces the interpreter is
smart enough, then the text written by a user will be more like
a declarative description of the facts about a goal and the means
available for attaining it than a program per se).
An immediate problem is that both the
kinds of abstract syntax presently available
and present pattern matching systems
are awkward for manipulating expressions containing bound
variables.
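As a footnote to the first item, the flavor of a non-evaluating ⊗cons
can be suggested in present LISP by storing an unevaluated form as the
cdr of a pair and forcing it on demand. The following is only a sketch
with made-up names (⊗intfrom, ⊗lcar, ⊗lcdr); Friedman and Wise build
the delay into ⊗cons itself:
.begin nofill
.skip 1
;; intfrom[n] represents the infinite list (n n+1 n+2 ...);
;; only its first element exists until more is demanded.
(DEFUN INTFROM (N) (CONS N (LIST 'INTFROM (ADD1 N))))

(DEFUN LCAR (L) (CAR L))
(DEFUN LCDR (L) (EVAL (CDR L)))  ; force one more step of the structure

;; (LCAR (LCDR (LCDR (INTFROM 0)))) = 2, although the list of all
;; the integers is never built.
.end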
.<< References>>
.skip 2
.bb References
%3Boyer, Robert S. and J. Strother Moore (1979)%1: %2A Computational Logic%1,
Academic Press, New York, xiv + 397 pp.
%3Cartwright, R.S. (1977)%1:
%2A Practical Formal Semantic Definition and Verification System
for Typed Lisp%1,
Ph.D. Thesis, Computer Science Department, Stanford University,
Stanford, California.
%3Cartwright, Robert and John McCarthy (1979)%1:
"Recursive Programs as Functions in a First Order Theory",
in %2Proceedings of the International Conference on Mathematical Studies of
Information Processing%1, Kyoto, Japan.
%3Kleene, Stephen C. (1952)%1: %2Introduction to Metamathematics%1,
D. Van Nostrand Co.
%3McCarthy, J. (1960)%1: "Recursive Functions of Symbolic Expressions and their
Computation by Machine," %2Comm. ACM%1, April 1960.
%3McCarthy, J. and Paul Abrahams, Daniel Edwards, Timothy
Hart and Michael Levin (1962)%1:
%2LISP 1.5 Programmer's Manual%1, M.I.T. Press, Cambridge, Mass.
%3McCarthy, J. (1962)%1: "Computer Programs for Checking Mathematical Proofs",
%2Amer. Math. Soc. Proc. of Symposia in Pure Math.%1, Vol. 5.
%3McCarthy, J. (1963)%1: "A Basis for a Mathematical Theory of Computation",
in P. Braffort and D. Hirschberg (eds.), %2Computer Programming and
Formal Systems%1, North-Holland Publishing Co., Amsterdam, pp. 33-70.
%3McCarthy, J. (1977)%1:
"History of LISP", in %2Proceedings of the ACM Conference on the
History of Programming Languages%1, Los Angeles.
%3McCarthy, John (1980)%1:
"Circumscription - A Form of Non-Monotonic Reasoning", %2Artificial
Intelligence%1, forthcoming.
%3McCarthy, John and Carolyn Talcott (1980)%1: %2LISP - Programming and
Proving%1, course notes, Stanford University (to be published as a book).
%3Weinreb, Daniel and David Moon (1978)%1: %2Lisp Machine Manual%1,
M.I.T. Artificial Intelligence Laboratory.
.if false then begin "supplement"
1. compilers and interpreters should handle tail recursion properly
2. Why do we need side-effect functions rather than frank operations?
Embedded SETQs are certainly convenient, but perhaps they are needed
in only a few cases.
3. Pattern directed computation inhabits a few special places
like macro-expansion rather than being a general facility.